Abstract

Many  systems,  composed  of  hardware,  software,  and  combinations  thereof,  function  in 
sequential  stages:  each  subsystem  (stage)  must  operate  correctly  in  order  for  the  next  to  be 
challenged.  All  stages,  including  the  interfaces  between  major  function  subsystems,  are 
subject  to  design  defects,  which  if  actuated  cause  that  stage,  and  hence  that  test,  to  fail.  We 
provide  models  that  evaluate  the  “testing  as  learning  and  improving”  paradigm:  the  models 
describe  the  effect  of  end-to-end  or  linked-stage  testing,  and  defect  identification  and 
removal,  on  field  or  delivered-system  reliability.  A  major  concern  is  the  evaluation  of 
operating  characteristics  of  such  test  designs  as  the  “first  run  of  r  total  system  successes  (e.g. 
3)”  stopping  rule.  The  models  include  Bayesian  formulations  in  which  the  unknown  number 
of  defects  in  each  subsystem  at  any  stage  during  testing  is  a  random  variable  with  known 
distribution.  The  models  and  methods  of  this  paper  provide  test  planners  with  the  answers  to 
“what if” questions concerning the likely future(s) of entire systems placed on test. They can
be used to address test resource requirements.

1. Introduction and Model Formulation

This  paper  provides  mathematical  models  of  reliability  growth  by  design  defect  or 
failure-mode  identification  and  removal  in  system  reliability  testing  and  management,  for 
instance  during  military  Test  and  Evaluation  (see  Seglie,  1998).  The  models  demonstrate 
how  testing  can  promote  early  learning  about,  and  rectification  of,  system  defects  in 
design,  manufacturing,  and  operations.  In  the  military  and  elsewhere,  such  testing  should, 
and  does,  begin  with  engineering-level  Developmental  Testing  (DT),  initially  of 
subsystems,  and  terminates  with  end-to-end  Operational  Testing  (OT).  At  present, 
attempts  are  being  made  to  compress  and  combine  DT  and  OT  so  as  to  shorten 
acquisition  time  and  decrease  its  expense.  The  models  proposed  are  intended  to  provide 
insight to modern test planners. Software that exercises the models is available from the
authors. 

The  model  structure  to  be  studied  is  the  following.  A  system,  S,  is  made  up  of  S 
(S  >  1)  subsystems  or  stages,  each  of  which  must  function  on  demand,  in  sequence,  for 
perfect  operation;  failure  of  any  subsystem,  especially  to  interact  with  another  subsystem 
(interaction  can  also  be  viewed  as  a  stage),  means  total  system  failure.  Demands  for 
subsystem,  or  inter-subsystem,  function  occur  in  order,  stagewise;  s  =  1,  2, ...,  S.  If  a 
demand  at  an  intermediate  stage/subsystem,  s,  succeeds,  i.e.  any  faults  do  not  activate,  a 
demand  is  placed  on  stage/subsystem  s  +  1;  if  all  such  demands  succeed,  the  entire  system 
operates  successfully  on  that  particular  test  or  usage  occasion  (it  may  not  again  if 
remaining  faults  activate).  That  is,  a  system  following  current  design  and  realization  must 
function  “end-to-end”  in  order  to  operate  reliably —  “suitably”  in  military  parlance.  This 
“success”  does  not  mean  that  the  particular  system  mission  is  necessarily  overall 
successful  or  “effective”  (a  reliable  weapon  may  not  accomplish  its  mission:  it  may  miss, 
or hit a wrong target). Such outcomes may also be, in part, reliability issues, but attributable to
C4ISR errors. Note, however, that the design defect removals we aim for may include
those  in  basic  functionality  (“effectiveness”)  such  as  accuracy  and  lethality. 

To  perform  a  system-level  operational  test  of  S,  suitable  test  conditions  are  first 
established.  It  is  desirable  to  quantify  those  conditions  (weather  and  other  environmental 
effects,  pre-test  transport  and  setup  stress,  target  properties,  etc.).  This  can  be  done  by 
incorporating  explanatory  variables  to  represent  between-test  variations.  For  recent 
related  modeling  see  Bogdonavicius  and  Nikulin  (2000).  Under  given  conditions  let  each 
subsystem  possess  a  certain  (random,  or  at  least  unknown)  number  of  failure  modes  (or 
defects), $d_s$, for subsystem $s$, $s \le S$. These modes become active (cause failure) with
probability $\bar{\theta}_s$ if a demand is received at that stage; otherwise they are inactive, or survive, with
probability $1 - \bar{\theta}_s = \theta_s$. In order for the $s$th subsystem to experience test, and hence
possibly reveal a failure mode, all previous $i \in \{1, 2, \ldots, s-1\}$ subsystems, and their
interconnection  and  transition  actions,  must  survive,  and  hence  transmit,  demands.  If  a 
failure  mode  in  a  subsystem  is  activated  (causes  failure),  the  design  or  execution  may 
well  be  modified.  Here  it  is  optimistically  assumed  that  the  failure  mode  is  removed,  and 
thus  “reliability  growth”  occurs,  but  this  simplicity  may  not  hold:  new  failure  modes  may 
actually  be  introduced,  and  bedrock  non-removable  failure  modes  will  remain.  These 
realities  are  ignored  for  simplicity  in  the  present  report,  so  the  results  are  likely  to  be 
optimistic.  We  also  ignore  the  detrimental  effects  of  system  aging  (one-shot  items 
eventually  age  on  the  shelf).  We  again  emphasize  that  in  operational  field  testing  it  is 
often  the  inter-subsystem  interactions  that  exhibit  surprising  new  failure  modes  which 
must  be  discovered  by  suitable  testing.  Our  model  can  cover  such  situations  by  simply 
defining  some  stages  as  “interaction  subsystems”. 

Here,  testing  the  complete  system  (e.g.  a  missile  or  an  information  system)  requires 
that “early” ($s = 1, 2, \ldots$) subsystems survive so that “late” ($s = \ldots, S-1, S$) subsystems can
experience demand, and hence literally be subjected to test. Failures of early subsystems
protect later subsystems from test; this effect must be overcome in order for the entire
system  to  be  tested.  Engineering-level  or  developmental  tests  (DT)  of  the  individual 
component  subsystems  will  be,  or  have  been,  carried  out,  but  these  cannot  be  completely 
trusted  to  identify  all  failure  modes  that  may  appear  in  actual  operation  when  the  entire 
system is assembled and tested, much less in the field. The ideas we explore are related
to, but not the same as, classical burn-in, whereby early testing removes weak components
from an existing population; see Block and Savits (1997) for a nice review, and also Lynn
and Singpurwalla (1997). Our problem emphasizes design burn-in: systems are tested and
the design improved before a population of manufactured and fielded items is created.
Members of that population can possibly then experience classical burn-in before
fielding,  but  the  need  should  be  reduced  if  the  design  has  already  been  improved. 

2.  Operationally  Relevant  Questions 

Given  preliminary  values  of  the  parameters,  inferred  from  engineering  design  and 
experience  with  analogous  subsystems  and  systems,  it  is  operationally  meaningful  to 
address  such  questions  as  these  prior  to  starting  expensive  field  testing: 

(a)  After  a  given  number  of  system  tests,  what  is  the  (approximate)  probability  that 
the  system  will  operate  satisfactorily  (not  fail)  when  released  to  the  field  or  delivered  to  a 
user? 

(b)  How  many  tests  are  likely  to  be  required  to  achieve  the  first  (or  /h)  end-to-end 
success? 

(c)  How  many  tests  are  required  to  achieve  r  (e.g.  3,  or  5)  consecutive  end-to-end  test 
successes,  or,  in  statistical  parlance,  a  (first)  run  of  r;  a  possible  test  stopping  rule  that  is 
attractive  because  of  its  simplicity  and  intuitive  evidence  of  system  success? 

(d) Suppose testing is stopped after T tests (possibly fixed, or random and governed by a
stopping rule such as “first occurrence of a run of r (e.g. r = 3, 5, ...) successful
reliability tests of the entire system”), after which no further design modifications are
contemplated.  What  are  the  failure  characteristics  of  the  system  if  fielded:  e.g.  what  is  the 
operational/field  probability  of  system  (reliability)  success?  For  a  previous  account  of 
this  measure  of  system  success  under  “reliability  growth”  see  Fries  and  Maillart  (1996). 
What  is  the  probability  that  the  system  completes  a  mission  that  requires  at  least  M 
successes  if  M+R  systems  are  allocated?  What  is  the  mean,  and  variability,  of  the 
number  of  tests  required? 
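
The allocation question in (d) can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that each of the M + R allocated systems succeeds independently with a common field success probability `q_field`; the function name and the independence assumption are illustrative and are not taken from the models of this paper.

```python
from math import comb

def mission_success_prob(m, r, q_field):
    """P(at least m successes among m + r systems).

    Assumes (for illustration only) that systems succeed independently,
    each with the same field success probability q_field.
    """
    n = m + r
    return sum(comb(n, k) * q_field ** k * (1.0 - q_field) ** (n - k)
               for k in range(m, n + 1))
```

For example, `mission_success_prob(1, 1, 0.5)` is the chance that at least one of two such systems succeeds.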

3.  Models  for  Discovery  of  Hidden,  or  Sequentially-Evident,  Design  Defects 

A  system  is  composed  of  a  number,  S,  of  subsystems  each  of  which  contains  an 
uncertain  number  of  failure  modes  (design  defects).  When  a  design  defect  is  activated 
during  a  test,  the  system  fails  at  that  subsystem,  and  that  particular  test  terminates.  Figure 
1  illustrates  the  configuration,  and  outcomes. 


Figure 1. Test flow: (Start Test), then stages 1, ..., S in sequence, ending in either (End Test Successfully) or (End Test Unsuccessfully at Stage s).


Let

$D_s(t)$ = number of design defects present in Stage s after t tests;

$D_s(0)$ = number of design defects present initially in Stage s before test (or at
least before operational, end-to-end, testing). At this stage the
distribution of $D_s(0)$ may be treated as a Bayes prior, or as an expression
of inherent variability; a Bayes prior can describe hyperparameters.

Stages are request-activated strictly serially, starting with Stage 1 and ending with
Stage S. If any stage fails to respond, the following stage is not demanded (request-activated)
and the system trial/test fails. The Stage s request activation can only occur if all
previous stages, i = 1, 2, ..., s-1, respond to their request activations. (This does not imply
that  stages  that  successfully  respond  to  their  request  activations  are  free  of  defects  -  they 
may  well  randomly  activate  and  cause  failures  on  later  trials,  or  even  on  a  mission  after 
system  release.)  Repeated  testing  tends  to  remove  defects,  but  there  will  be  little  reward 
from  testing  long  enough  to  eliminate  defects  that  are  unlikely  to  activate  in  the  field. 
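
The serial request-activation mechanism can be mimicked by a small Monte Carlo sketch. It assumes the Binomial case used later in the paper (a stage holding d defects passes a demand with probability theta ** d) and the optimistic assumption that each failure removes exactly one defect from the failing stage; the function names and the seeded generator are illustrative choices.

```python
import random

def run_test(defects, theta, rng):
    """Run one end-to-end test, updating `defects` in place.

    Returns the 0-based index of the failing stage, or None if every
    stage responds and the test ends successfully.
    """
    for s, (d, th) in enumerate(zip(defects, theta)):
        # Assumed Binomial case: the stage passes with probability th ** d.
        if rng.random() >= th ** d:
            defects[s] -= 1      # the activated defect is found and removed
            return s             # later stages are never demanded
    return None

def simulate(defects0, theta, n_tests, seed=0):
    """Simulate n_tests consecutive tests; return final defects and outcomes."""
    rng = random.Random(seed)
    defects = list(defects0)
    outcomes = [run_test(defects, theta, rng) for _ in range(n_tests)]
    return defects, outcomes
```

A stage with no defects always passes (th ** 0 equals 1), so defect counts never go negative, and every failed test removes exactly one defect.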

4.  Model:  Stage-wise  (Binomial)  Failures,  One-at-a-Time  Removable 

Invoke  the  stage-wise  failure  model,  and  allow  only  one  activated  design  defect  to  be 
identified  and  removed  per  test  (no  new  design  defects  are  introduced).  If  more  than  one 
failure  mode  or  defect  is  activated  on  a  test  we  assume  that  only  one  of  these  can  be 
identified  and  removed;  the  others  can  activate  again. 

There  follow  several  functional-equation  modeling  systems  that  respond  to  questions 
posed  earlier. 

4.1  Expected  Probability  of  System  Field  Success  After  a  Fixed  Number,  t,  of  Tests 

Let $D_i(0)$ be the initial number of defects in stage $i$, $i = 1, \ldots, S$. Assume that some
defect in stage $i$ is revealed during a test with probability $p_i(d_i)$; $q_i(d_i) = 1 - p_i(d_i)$,
where $d_i$ is the number of defects in stage $i$, assumed $\ge 1$. A special case is $q_i(d_i) = \theta_i^{d_i}$,
but allowance for random (extra) variability in $\theta$ (e.g. as a Beta random variable) is
natural to reflect within-stage variability beyond the simple binomial. Note that this is
viewed as representing physical mixtures; it is not necessarily a Bayesian prior. A defect
revealed in stage $i$ causes the system to fail without revealing any defects in later stages;
these are hidden for that test. Each test reveals at most one identifiable defect or fault.
Such a discovered defect is assumed removed with certainty (with probability equal to
one) by present assumption (if it is successfully removed with probability p then we may
replace the probability a defect is discovered and removed, e.g. $1 - \theta$, by $(1 - \theta)p$ and
proceed). Let $D_i(t)$ be the number of defects remaining in stage $i$ after t tests. Let $Q_i$ be the
probability a remaining defect in stage $i$ does not activate while the system is put in use in
the field during one mission. (Desirably, $Q_i \approx$ or $> q_i$, the probability a test does not
reveal a defect.) Note: for the initial example, but not throughout, activation of some design
fault or defect is Binomial: $p_i(d_i) = 1 - \theta_i^{d_i}$. Extra-Binomial stage-to-stage variability
can be represented by mixing within stages, using $E[\theta_i^{d_i}]$; see Appendix A.

Define

$$p(d_1, \ldots, d_S, t) = P\{D_1(t) = d_1, \ldots, D_S(t) = d_S\}, \tag{4.1}$$

the joint probability of the number of defects present in each stage after t tests. The
following forward equation (Markov chain) can be established by conditioning:

$$
\begin{aligned}
p(d_1,\ldots,d_S,t+1) ={}& \underbrace{p(d_1,\ldots,d_S,t)\prod_{i=1}^{S} q_i(d_i)}_{\text{No design defects removed on test } t} \\
&+ \underbrace{\sum_{i=1}^{S} p(d_1,\ldots,d_i+1,\ldots,d_S,t)\Bigl(\prod_{j=1}^{i-1} q_j(d_j)\Bigr)^{*}\bigl[1-q_i(d_i+1)\bigr]}_{\text{One design defect removed on test } t \text{ (e.g. from stage } i,\ i=1,2,\ldots,S)}
\end{aligned}
\tag{4.2}
$$

Note: the term in the last product, $(\ )^{*} = 1$ if $i = 1$.

The recursion is initialized with

$$
p(d_1,\ldots,d_S,0) =
\begin{cases}
1 & \text{if } D_1(0)=d_1,\ldots,D_S(0)=d_S \\
0 & \text{otherwise.}
\end{cases}
\tag{4.3}
$$

The probability of system survival in the field after t tests is

$$Q(t) = \sum_{d_1,\ldots,d_S} p(d_1,d_2,\ldots,d_S,t)\prod_{j=1}^{S} Q_j^{d_j}. \tag{4.4}$$

Or, more generally, with field probabilities of survival allowed to differ from those of the
test and to experience stage-wise mixing,

$$Q(t) = \sum_{d_1,\ldots,d_S} p(d_1,d_2,\ldots,d_S,t)\prod_{j=1}^{S} Q_j(d_j). \tag{4.5}$$

The following examples are based on simple Binomial stagewise failures:

$$q_i(d_i) = \theta_i^{d_i}, \qquad Q_i(d_i) = \theta_i^{d_i}.$$
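
The forward equation and the field survival probability Q(t) lend themselves to direct numerical evaluation. The following is a minimal sketch, assuming the Binomial case $q_i(d) = \theta_i^d$ and the reading that a failure at stage i (probability: all earlier stages pass, stage i does not) removes exactly one defect from that stage. The function name, the dictionary-based state table, and the default of field probabilities equal to test probabilities are illustrative choices, not the authors' software.

```python
from itertools import product
from math import prod

def field_survival(d0, theta, t_max, Q=None):
    """Return [Q(0), ..., Q(t_max)] for initial defect counts d0.

    theta[i] is the per-defect test survival probability of stage i;
    Q[i] is the per-defect field survival probability (defaults to theta).
    """
    Q = theta if Q is None else Q
    S = len(d0)
    states = list(product(*(range(n + 1) for n in d0)))
    p = {s: 0.0 for s in states}
    p[tuple(d0)] = 1.0                      # all mass on the initial load
    out = []
    for _ in range(t_max + 1):
        # Field survival: average of prod_j Q_j ** d_j over current states.
        out.append(sum(pr * prod(Q[i] ** s[i] for i in range(S))
                       for s, pr in p.items()))
        nxt = {s: 0.0 for s in states}
        for s, pr in p.items():
            if pr == 0.0:
                continue
            reach = 1.0                     # P(all stages before i pass)
            for i in range(S):
                qi = theta[i] ** s[i]
                if s[i] > 0:
                    down = s[:i] + (s[i] - 1,) + s[i + 1:]
                    nxt[down] += pr * reach * (1.0 - qi)   # failure at stage i
                reach *= qi
            nxt[s] += pr * reach            # successful test, nothing removed
        p = nxt
    return out
```

With a single stage, one defect and theta = 0.5, the sketch gives Q(0) = 0.5 and Q(1) = 0.5 + 0.5 * 0.5 = 0.75, which can be checked by hand.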


Figure 2. Probability of system survival after t tests (t = 0, ..., 20). Initial number of defects: 3, 3, 3, 3. Series: θ = (.5, .5, .5, .5); θ = (.25, .25, .75, .75); θ = (.75, .75, .25, .25).


The graph suggests that reliability growth in a several-stage serial system is not likely to
have the characteristic of classical one-stage reliability growth models of Duane and later
authors, e.g. Fries (1993). There are ample physical reasons for this behavior. The results also
imply that more rapid and complete defect elimination, and hence “reliability growth”,
occurs if the last-reached system stages are apt to fail sooner than the earlier-reached
stages. The reason is that the need to re-test the last stages forces more tests of the earlier stages
because of the end-to-end success requirement. It is unlikely that a designer, or tester, can
ever  directly  influence  such  a  distribution  of  defects,  but  there  may  be  implications  for 
variations  in  the  intensity  of  component-level  testing:  one  might  tolerate  a  few  more 
faults  in  later  stages,  so  that  earlier  stages  will  be  subjected  to  more  operational  end-to- 
end  tests. 


4.2  Probability  of  Mission  Success  in  the  Field  if  Testing  Stops  After  First  Run  of  r 
Consecutive  Successful  Tests 

Suppose the system test is stopped when there are r ($r \ge 1$, e.g. 3) successful end-to-end
tests in a row (a “first run of r”). A test with this stopping rule ensures that all stages
are tested at least r times. The probability of system survival after completion of the test
can be computed using a backward equation as follows. Let $\rho_r(d_1, \ldots, d_S)$ be the
conditional probability of system mission survival in the field after the test, given that the
initial numbers of defects are $D_1(0) = d_1, \ldots, D_S(0) = d_S$. Use the previous stagewise
survival probabilities, $Q_i(d_i)$, to write

$$
\begin{aligned}
\rho_r(d_1,d_2,\ldots,d_S) ={}& \underbrace{\Bigl(\prod_{i=1}^{S} q_i(d_i)\Bigr)^{r}\prod_{i=1}^{S} Q_i(d_i)}_{\text{Run of } r \text{ successful tests occurs before any test failures}} \\
&+ \underbrace{\Bigl[1-\Bigl(\prod_{i=1}^{S} q_i(d_i)\Bigr)^{r}\Bigr]}_{\text{No } r\text{-run during first } r \text{ tests}}
\underbrace{\sum_{i=1}^{S}\Bigl(\prod_{j=1}^{i-1} q_j(d_j)\Bigr)^{*}\bigl(1-q_i(d_i)\bigr)\Bigl[1-\prod_{j=1}^{S} q_j(d_j)\Bigr]^{-1}\rho_r(d_1,\ldots,d_i-1,\ldots,d_S)}_{\text{Start over after a failure at stage } i \text{ before run of } r \text{ successful tests achieved}}
\end{aligned}
\tag{4.6}
$$

Note: $(\ )^{*} = 1$ if $i = 1$.

The recursion starts with $\rho_r(0, \ldots, 0) = 1$, and thus builds up to any desired initial load of
design defects.

Here the field survival probability is assumed equal to the field test system survival
probability: $Q(d) = q(d)$.
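
The backward recursion for the run-of-r stopping rule can be sketched with memoization. The code below assumes the Binomial case $q_i(d) = \theta_i^d$ with $0 < \theta_i < 1$, and reads the recursion as: with probability (test success)^r the run occurs at once; otherwise the first failure falls at stage i with conditional probability (earlier stages pass)(stage i fails)/(1 - test success), one defect is removed, and the process starts over. Names are illustrative.

```python
from functools import lru_cache
from math import prod

def field_success_after_run(d0, theta, r=3, Q=None):
    """Mean field success probability when testing stops at the first run of r."""
    Q = theta if Q is None else Q      # field = test probabilities by default
    S = len(d0)

    @lru_cache(maxsize=None)
    def rho(d):
        if not any(d):
            return 1.0                 # no defects left: certain field success
        q = [theta[i] ** d[i] for i in range(S)]
        pi = prod(q)                   # P(a single test succeeds end-to-end)
        total = pi ** r * prod(Q[i] ** d[i] for i in range(S))
        reach = 1.0                    # P(all stages before i pass)
        for i in range(S):
            if d[i] > 0:
                w = (1.0 - pi ** r) * reach * (1.0 - q[i]) / (1.0 - pi)
                total += w * rho(d[:i] + (d[i] - 1,) + d[i + 1:])
            reach *= q[i]
        return total

    return rho(tuple(d0))
```

For one stage with one defect and theta = 0.5, the recursion reduces to 1 - theta^r (1 - theta), i.e. 0.9375 for r = 3.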


Note:  the  above  equation  permits  quick  numerical  determination  of  the  mean  or 
unconditional  probability  of  field  success.  Simulations  show  that  there  can  be  substantial 
difference  between  the  mean  and  the  actual  probability  of  success,  depending  on  fault 
survival.  The  following  forward  equation  can  be  used  to  calculate  the  distribution  of  the 
probability  of  system  survival  after  the  test. 

Let $\gamma_r(a_1, \ldots, a_S)$ be the probability that there are $a_i$, $i = 1, \ldots, S$, defects remaining
sometime during the test. The probabilities $\gamma_r(a_1, \ldots, a_S)$ can be obtained recursively as
follows.

$$
\gamma_r(a_1,\ldots,a_S) = \sum_{s=1}^{S}\gamma_r(a_1,\ldots,a_s+1,\ldots,a_S)\,
\frac{1-\Bigl(q_s(a_s+1)\prod_{i\neq s} q_i(a_i)\Bigr)^{r}}{1-q_s(a_s+1)\prod_{i\neq s} q_i(a_i)}
\Bigl(\prod_{i=1}^{s-1} q_i(a_i)\Bigr)^{*}\bigl[1-q_s(a_s+1)\bigr]
\tag{4.7}
$$

with initial condition $\gamma_r(d_1, \ldots, d_S) = 1$, where $d_i$ is the initial number of defects in stage i,
i = 1, ..., S; note $(\ )^{*} = 1$ if $s = 1$.

The probability of having $a_i$ remaining defects in stage i, i = 1, ..., S, after completion
of the test is

$$\gamma_r(a_1,\ldots,a_S)\Bigl[\prod_{i=1}^{S} q_i(a_i)\Bigr]^{r}. \tag{4.8}$$


These  probabilities  can  be  used  to  obtain  the  distribution  of  the  probability  of  field 
survival  after  the  test  is  completed. 
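
One way to obtain this distribution numerically is to push probability forward through the defect states in order of decreasing total defect count. The sketch below assumes the Binomial case $q_i(d) = \theta_i^d$ and the following reading of the recursion: a visited state either produces a run of r successes (and testing stops there) or loses one defect at the stage of the first failure. The function name and data layout are illustrative.

```python
from itertools import product
from math import prod

def final_defect_distribution(d0, theta, r=3):
    """Map each defect vector to the probability that testing stops there."""
    S = len(d0)
    # Process states from most to fewest total defects, so that every
    # predecessor of a state is finished before the state itself is used.
    states = sorted(product(*(range(n + 1) for n in d0)), key=sum, reverse=True)
    visit = {s: 0.0 for s in states}    # P(state is visited during testing)
    visit[tuple(d0)] = 1.0
    final = {}
    for s in states:
        q = [theta[i] ** s[i] for i in range(S)]
        pi = prod(q)                    # P(a single test succeeds)
        final[s] = visit[s] * pi ** r   # run of r occurs before any failure
        if pi < 1.0:
            scale = visit[s] * (1.0 - pi ** r) / (1.0 - pi)
            reach = 1.0                 # P(all stages before i pass)
            for i in range(S):
                if s[i] > 0:
                    down = s[:i] + (s[i] - 1,) + s[i + 1:]
                    visit[down] += scale * reach * (1.0 - q[i])
                reach *= q[i]
    return final
```

The returned probabilities sum to one, and weighting each state by its field survival probability gives the full distribution of field survival rather than only its mean.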

Numerical results with Bernoulli-trials stagewise variability

In the example whose results are displayed in Figure 3, $q_i(d_i) = Q_i(d_i) = \theta_i^{d_i}$. Note
that Figure 3 displays considerable robustness of mean mission survival outcome to
number of design defects and activation probabilities: often the mission survival
probability exceeds 0.8-0.9. Note that in the case $\theta_1 = 0.75$, $\theta_2 = 0.25$, $\theta_3 = 0.75$,
$\theta_4 = 0.25$, the probability of mission survival after testing increases slightly as the initial
number of defects in each stage increases. In this case the larger test activation
probabilities $\bar{\theta} = 1 - \theta = 0.75$ for stages 2 and 4 result in more testing of stages 1 and 3.
Consequently, the design defects in stages 1 and 3 are more apt to be discovered and
removed; the probability of mission survival after testing increases as the number of
defects increases.


4.3  Mean  Time  to  Stop  After  Reaching  First  Run  of  r  Consecutive  Successful  Tests 

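This is a measure of the time cost of a test program that is run-terminated. Let
$\tau_r(d_1, d_2, \ldots, d_S)$ be the conditional expected time (number of tests) until a run of r
successes first occurs, given that there are initially $d_i$ defects in stage i. Here is a
backward equation for this performance measure.

$$
\begin{aligned}
\tau_r(d_1,\ldots,d_S) ={}& \underbrace{r\Bigl(\prod_{i=1}^{S} q_i(d_i)\Bigr)^{r}}_{r\text{-run after } r \text{ tests: initial } r\text{-run}}
+ \underbrace{\Bigl[1-\Bigl(\prod_{i=1}^{S} q_i(d_i)\Bigr)^{r}\Bigr]}_{\text{Probability no initial run of } r} \\
&\times \sum_{n=1}^{r}
\underbrace{\Bigl\{n+\sum_{j=1}^{S}\tau_r(d_1,\ldots,d_j-1,\ldots,d_S)\,\frac{\bigl(\prod_{i=1}^{j-1} q_i(d_i)\bigr)^{*}\bigl(1-q_j(d_j)\bigr)}{1-\prod_{i=1}^{S} q_i(d_i)}\Bigr\}}_{\text{Expected number of tests to achieve } r\text{-run, given failure to achieve initial } r\text{-run}}
\underbrace{\frac{\bigl(\prod_{i=1}^{S} q_i(d_i)\bigr)^{n-1}\bigl(1-\prod_{i=1}^{S} q_i(d_i)\bigr)}{1-\bigl(\prod_{i=1}^{S} q_i(d_i)\bigr)^{r}}}_{\text{Probability first activation after } n \le r \text{ tests, given no run of } r}
\end{aligned}
\tag{4.9}
$$

Note: $(\ )^{*} = 1$ if $j = 1$.

An initial condition is

$$\tau_r(0, 0, \ldots, 0) = r. \tag{4.10}$$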
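
The mean number of tests under the run-of-r rule can be computed by the same kind of memoized backward recursion. This is a sketch under the Binomial assumption $q_i(d) = \theta_i^d$ with $0 < \theta_i < 1$; it conditions on the test n at which the first failure occurs (n = 1, ..., r) and on the stage that fails, then restarts with one fewer defect. Names are illustrative.

```python
from functools import lru_cache
from math import prod

def mean_tests_to_run(d0, theta, r=3):
    """Expected number of tests until the first run of r successes."""
    S = len(d0)

    @lru_cache(maxsize=None)
    def tau(d):
        if not any(d):
            return float(r)            # defect-free system: exactly r tests
        q = [theta[i] ** d[i] for i in range(S)]
        pi = prod(q)                   # P(a single test succeeds)
        # Expected remaining tests after the first failure, averaged over
        # the stage at which that failure occurs.
        after, reach = 0.0, 1.0
        for j in range(S):
            if d[j] > 0:
                p_stage = reach * (1.0 - q[j]) / (1.0 - pi)
                after += p_stage * tau(d[:j] + (d[j] - 1,) + d[j + 1:])
            reach *= q[j]
        total = r * pi ** r            # immediate run of r successes
        for n in range(1, r + 1):      # first failure occurs on test n
            total += (n + after) * pi ** (n - 1) * (1.0 - pi)
        return total

    return tau(tuple(d0))
```

As a hand check, one stage with one defect, theta = 0.5 and r = 1 needs one test with probability 0.5 and two tests otherwise, for a mean of 1.5.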


Numerical  results  with  stagewise  over-variability. 

In  the  examples  of  Figures  4,  6,  and  8  we  compare  the  probability  of  mission  success 
for several cases when (a) the stagewise probabilities $\theta_i$ are taken as invariable (“fixed”)
Bernoulli-trials probabilities vs. (b) the stagewise probabilities are themselves variable,
independently for each stage and test. This extra-Binomial variability can conveniently be
characterized by a diffuse beta distribution with mean equal to the fixed values of (a). The
variability in (b) represents some aspects of uncontrollable between-test and
between-test-stage conditions; see Appendix A. In each case that has been investigated the
probability  of  field  success  is  higher  for  (a),  “fixed”  or  controlled  probabilities,  than  for 
(b),  the  corresponding  stage-wise  mixed  probability.  Practical  considerations  suggest  that 
(b)  may  be  the  more  qualitatively  realistic,  because  of  the  likelihood  of  extra 
uncontrollable  variations  in  the  field.  Some  such  are  likely  to  be  roughly  common  to  all 
stages;  this  is  analyzed  in  Appendix  B. 
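
The effect of stage-wise mixing on a single demand can be seen from the beta moments. The sketch below assumes that mixing replaces $\theta^d$ by $E[\theta^d]$ with $\theta \sim \text{Beta}(a, b)$, drawn afresh for each stage and test; Appendix A is not reproduced here, so this is an assumed reading, and the function name is illustrative. It uses the standard identity $E[\theta^d] = \prod_{k=0}^{d-1} (a+k)/(a+b+k)$.

```python
def beta_mixed_q(d, a, b):
    """E[theta ** d] for theta ~ Beta(a, b): the chance that a stage holding
    d defects passes one demand under beta mixing (assumed model)."""
    out = 1.0
    for k in range(d):
        out *= (a + k) / (a + b + k)
    return out
```

With a = b = 0.1 the mean is 0.5, yet a stage with three defects passes with probability about 0.44 rather than 0.5 ** 3 = 0.125. Under this assumed model, defects are therefore exposed less often per demand than with fixed probabilities of the same mean.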

From  Figures  4,  6,  and  8  it  is  striking  that  the  order  of  the  defect  survival  probability 
occurrence  (which  may  be  practically  difficult  to  control  at  the  developmental  testing 
stage) can be influential at the final field survival probability level. Once again, the case
of Figure 8, $\theta$: 0.75, 0.25, 0.75, 0.25, exhibits improved field response with more defects
for the Bernoulli-trials case, but not for the over-variability case studied. From Figures 5,
7,  and  9  it  is  seen  that  the  mean  times  to  achieve  a  success  run  of  3  for  the  different 
parametric  cases  are  remarkably  similar.  These  are  isolated  examples  only,  but  certainly 
promote  interest  in  run-like  rules. 

Figure 4. Probability of field success for a test with stopping rule of 3 successes in a row, against the initial number of defects in each of 4 stages. Series: θ = .5, .5, .5, .5; beta mixed: a = .1, b = .1.


The mixing distribution employed in Figure 4 is symmetric, but with high weighting near
0 and 1. Such an environment badly penalizes the tester if there are many (e.g. 5 or more)
defects  in  the  system  initially. 

Figure 5. Mean number of tests to obtain 3 successes in a row, against the initial number of defects in each of 4 stages (0 to 10). Series: θ = .5, .5, .5, .5; beta mixed: a = .1, b = .1.


The  expected  times  to  complete  the  tests  in  Figure  5  are  remarkably  similar  for  these 
cases. 


Figure 6. Horizontal axis: initial number of defects in each of 4 stages (0 to 10). Series: θ: .25, .75, .25, .75; beta mixed (a, b): (.3, .1), (.1, .3), (.3, .1), (.1, .3).

In  Figure  6  a  somewhat  diffuse  mixing  distribution  (Beta)  is  used  for  each  stage,  with 
means located at the “deterministic” levels. Once again, however, the stagewise mixtures,
independent across stages and redrawn independently between tests, have a substantial
degrading  effect  on  the  mean  probability  of  a  system’s  field  success  if  the  system  is 
accepted  after  a  run  of  3. 

Figure 9. Mean number of tests to obtain 3 successes in a row, against the initial number of defects in each of 4 stages. Series: θ: .75, .25, .75, .25; beta mixed (a, b): (.1, .3), (.3, .1), (.1, .3), (.3, .1).


5.  Bayesian  Formulations 

A natural approach to the uncertainty concerning the numbers of design defects
initially present is a Bayesian one in which $D_i(0)$ is treated as a random variable with
(prior) distribution $\Pi^i = \{\Pi^i_d,\ d \ge 0\}$, $1 \le i \le S$. In what follows we shall suppose that the
random variables $D_i(0)$, $1 \le i \le S$, are independent and that the conditional model for
failure discovery and removal is as in Section 4 with $q_i(d_i)$ the conditional probability of
subsystem i success, given $d_i$ defects present. In such a setup, consider the situation
following  t  tests  of  the  system. 

Each subsystem i will have its own history $H_{it} = \{x_{i1}, x_{i2}, \ldots, x_{it}\}$ where each $x_{ij}$ takes
one of three possible values, namely

{subsystem i was not subjected to scrutiny during test j because of the
failure of an earlier subsystem} = $O_{ij}$,

{subsystem i was subjected to scrutiny during test j and operated
successfully} = $S_{ij}$, and

{subsystem i was subjected to scrutiny during test j and a defect was
activated and removed} = $F_{ij}$.

The first of these cannot occur for subsystem 1. Let $\Pi^i(t)$ be the inferred (posterior)
distribution of $D_i(t)$ upon suitable application of Bayes’ theorem. Updating is described as
follows:

$$
\Pi^i_d(t+1)
\begin{cases}
= \Pi^i_d(t), & \text{if } x_{i,t+1} = O_{i,t+1}; \\
\propto \Pi^i_d(t)\,q_i(d), & \text{if } x_{i,t+1} = S_{i,t+1}; \text{ and} \\
\propto \Pi^i_{d+1}(t)\{1 - q_i(d+1)\}, & \text{if } x_{i,t+1} = F_{i,t+1}.
\end{cases}
\tag{5.1}
$$

In general the posterior $\Pi^i(t)$ will depend upon the entire history $H_{it}$ and in particular will
depend upon the order in which successes and failures occur.

In  this  highly  complex  scenario  it  seems  reasonable  to  make  an  initial  search  for 
simplicity. In particular, we seek conditions under which each $\Pi^i(t)$ depends upon $H_{it}$ only
through $\{S_i(t), F_i(t)\}$, where

$$S_i(t) = \sum_{j=1}^{t} I\{x_{ij} = S_{ij}\},$$

the total number of successful operations of subsystem i during t tests, and $F_i(t)$ is
similarly defined in terms of failures (i.e. defect activations and removals). Expressed
simply, we require that the numbers of successes and failures to date should be sufficient
statistics for each subsystem. Work by Benkherouf and Bather (1988) in the context of oil
exploration implies that this requirement forces a conditional model of the form

$$q_i(d) = \theta_i^{d},\quad d \ge 0,\ \text{for some } \theta_i,\ 0 < \theta_i < 1, \tag{5.2}$$

which is the Binomial case of Section 4.1. Until indicated otherwise we suppose that
(5.2) holds.

Further, Glazebrook (1993) introduced a family of prior distributions which are
conjugate for this problem. Let $D_i(0) \sim \Pi^i(0) = \Pi(\lambda_i, \theta_i, \phi_i)$, $1 \le i \le S$, where the
probability mass function (p.m.f.) for $\Pi(\lambda, \theta, \phi)$ is given by

$$\Pi_d(\lambda,\theta,\phi) = \Pi_0(\lambda,\theta,\phi)\,\lambda^{d}\,\theta^{\phi d(d-1)/2}\Bigl\{\prod_{j=1}^{d}(1-\theta^{j})\Bigr\}^{-1},\quad d \ge 0, \tag{5.3}$$

where $\Pi_0(\lambda, \theta, \phi)$ is a normalizing constant. The parameter space associated with this
family is $\{0 < \lambda < 1,\ 0 < \theta < 1,\ \phi = 0\} \cup \{\lambda > 0,\ 0 < \theta < 1,\ \phi > 0\}$. The first parameter $\lambda$
may be interpreted as an overall rate of finding failures while $\phi$ may be thought of as a
rate of depletion of defects in a subsystem under failure, and subsequent defect removal.
Parameter $\theta$ is always assigned the value of the probability in the conditional Binomial
model in (5.2). Special cases of this model are

$$\Pi(\lambda, \theta, 0) = E(\lambda, \theta); \qquad \Pi(\lambda, \theta, 1) = H(\lambda, \theta), \tag{5.4}$$

the Euler and Heine distributions with parameters $(\lambda, \theta)$ respectively. These are discussed
by Benkherouf, Glazebrook, and Owen (1992). The reader should note that, for regions of
its parameter space, the Euler distribution may approach either a Poisson or a geometric
distribution. Thus the prior family (5.3) is quite versatile.

With the prior $\Pi(\lambda_i, \theta_i, \phi_i)$ in (5.3) and the conditional model (5.2), the posterior
distribution for $D_i(t)$ is given (upon operation of (5.1)) by

$$\Pi^i(t) = \Pi\bigl\{\lambda_i\theta_i^{S_i(t)+\phi_iF_i(t)},\ \theta_i,\ \phi_i\bigr\},\quad 1 \le i \le S. \tag{5.5}$$

From (5.5), the situation following t tests is such that the overall rates of defect
detection in subsystems have fallen to the values

$$\lambda_i\theta_i^{S_i(t)+\phi_iF_i(t)},\quad 1 \le i \le S; \tag{5.6}$$

it is reasonable to stop testing when these values are sufficiently small.
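
Conjugacy can be checked numerically. The sketch below assumes the p.m.f. of the prior family is proportional to $\lambda^d \theta^{\phi d(d-1)/2} / \prod_{j=1}^{d}(1-\theta^j)$ and that $q(d) = \theta^d$; under those assumptions a success multiplies $\lambda$ by $\theta$, a failure multiplies it by $\theta^\phi$, and an unobserved stage leaves it unchanged. The brute-force update on a truncated support is then compared with this closed form. All function names and the truncation point `n_max` are illustrative.

```python
def pi_pmf(lam, theta, phi, n_max=400):
    """Truncated p.m.f. of the assumed prior family on d = 0..n_max."""
    weights, w = [], 1.0
    for d in range(n_max + 1):
        weights.append(w)
        # Ratio of successive terms, Pi_{d+1} / Pi_d.
        w *= lam * theta ** (phi * d) / (1.0 - theta ** (d + 1))
    z = sum(weights)
    return [x / z for x in weights]

def bayes_step(prior, theta, outcome):
    """Brute-force Bayes update of a p.m.f. for outcome 'O', 'S' or 'F'."""
    if outcome == "O":                 # stage not reached: no information
        return list(prior)
    if outcome == "S":                 # stage demanded and succeeded
        w = [p * theta ** d for d, p in enumerate(prior)]
    else:                              # 'F': a defect activated and was removed
        w = [prior[d + 1] * (1.0 - theta ** (d + 1))
             for d in range(len(prior) - 1)]
    z = sum(w)
    return [x / z for x in w]

def conjugate_step(lam, theta, phi, outcome):
    """Closed-form update of the first parameter only."""
    if outcome == "S":
        return lam * theta
    if outcome == "F":
        return lam * theta ** phi
    return lam
```

For an Euler prior (phi = 0) the same machinery shows that the unconditional probability of a stage passing its first demand is 1 - lam, so lam plays the role of the initial failure probability.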

5.1 Bayes stopping rules

With  the  above  structures  in  place  we  can  design  stopping  rules  which  are  Bayes 
optimal  for  a  range  of  objectives.  In  such  problems,  a  major  difficulty  is  presented  by  the 
computational  demands  of  producing  a  fully  sequential  solution  via  dynamic 
programming.  However,  in  the  context  of  reliability  growth  in  which  the  number  of 
defects  is  reduced  (stochastically)  at  each  test,  one-step-look-ahead  (myopic)  rules  should 
perform  well.  See,  for  example,  Benkherouf  and  Bather  (1988),  Gittins  (1989)  and  Manor 
and  Kress  (1997)  for  examples  of  this  phenomenon.  Suppose  that  we  wish  to  maximize 
an  objective 


$$E_\Pi\{-cT + Q_T\} \tag{5.7}$$

where T is the number of tests performed, c is a (suitably standardized) cost per test and

$$
Q_T =
\begin{cases}
1, & \text{for successful field deployment following } T \text{ tests,} \\
0, & \text{otherwise.}
\end{cases}
$$


In (5.7) the expectation is taken with respect to the prior distributions (summarized by $\Pi$).

A key quantity required for the development of a solution to (5.7) is the predictive
probability of successful field deployment $Q(\mathbf{S}, \mathbf{F})$ at a point at which the data from
testing are summarized by sufficient statistics $(\mathbf{S}, \mathbf{F}) = \{(S_i, F_i),\ 1 \le i \le S\}$. Utilizing the
above independence assumptions,

$$Q(\mathbf{S},\mathbf{F}) = \prod_{i=1}^{S} Q_i(S_i, F_i) \tag{5.8}$$

where $Q_i(S_i, F_i)$ is the predictive probability of successful field deployment of subsystem i
with sufficient statistics $(S_i, F_i)$, $1 \le i \le S$. Taking $\Pi^i(0) = \Pi(\lambda_i, \theta_i, \phi_i)$, conditional model
(5.2) and $Q_i(d) = \theta_i^{d}$, $d \ge 0$, $1 \le i \le S$, we deduce from (5.5) that
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We  write  E{Q{S,  F)}  for  the  expected  predictive  probability  of  successful  field 
deployment  following  one  additional  test,  starting  from  (S,  F).  Utilizing  (5.5)  and  (5.8) 
we  deduce  that 


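$$E\{Q(\mathbf{S},\mathbf{F})\} = \sum_{i=1}^{S} \Big\{\prod_{j=1}^{i-1} q_j(S_j,F_j)\Big\} \{1 - q_i(S_i,F_i)\}\, Q\Big(\mathbf{S} + \sum_{j=1}^{i-1} \mathbf{1}^j,\ \mathbf{F} + \mathbf{1}^i\Big) + \Big\{\prod_{i=1}^{S} q_i(S_i,F_i)\Big\}\, Q\Big(\mathbf{S} + \sum_{i=1}^{S} \mathbf{1}^i,\ \mathbf{F}\Big) \qquad (5.10)$$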


where in (5.10), $\mathbf{1}^i$ is an $S$-vector whose $i$th component is one, with zeros elsewhere. A Bayes myopic stopping rule for problem (5.7) concludes testing as soon as the gain in system reliability from one further test is less than the cost of the test. Formally, the associated stopping region is

$$\{(\mathbf{S}, \mathbf{F});\ E\{Q(\mathbf{S}, \mathbf{F})\} - Q(\mathbf{S}, \mathbf{F}) < c\}. \qquad (5.11)$$

When $c$ is small (i.e. the cost of one more test is negligible compared to the utility of having a system fully operational in the field) then from (5.9) and (5.10) we can show that the stopping region in (5.11) may be well approximated by

(S,F);j|;^f'+^(l-^)|e(S,F)<c  (5.12) 

(S,F);g^fi+^(l-^)<c|. 

See the comments around (5.6) above. The simple conservative stopping rules above will approximate (5.11) well for $c$ close to zero.
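The myopic rule in (5.11) can be illustrated with a minimal sketch for a single subsystem. The sketch assumes the Binomial conditional model (a test passes with probability `theta**d` when `d` defects are present), removal of exactly one defect per failure, field success probability `q_field**d`, and a finite prior vector; the function names and the one-defect prior are hypothetical and chosen only for illustration.

```python
def update(pmf, theta, success):
    """Posterior pmf of the defect count after one test of a single subsystem."""
    n = len(pmf)
    if success:
        # a success with d defects present has probability theta**d
        w = [pmf[d] * theta**d for d in range(n)]
    else:
        # a failure with d+1 defects present removes one defect, leaving d
        w = [pmf[d + 1] * (1.0 - theta**(d + 1)) if d + 1 < n else 0.0
             for d in range(n)]
    total = sum(w)
    return [x / total for x in w]


def field_success(pmf, q_field):
    """Predictive probability of field success, assuming Q(d) = q_field**d."""
    return sum(p * q_field**d for d, p in enumerate(pmf))


def one_step_gain(pmf, theta, q_field):
    """E{Q after one more test} minus Q now, the quantity compared with c in (5.11)."""
    now = field_success(pmf, q_field)
    p_pass = sum(p * theta**d for d, p in enumerate(pmf))
    ahead = 0.0
    if p_pass > 0.0:
        ahead += p_pass * field_success(update(pmf, theta, True), q_field)
    if 1.0 - p_pass > 1e-15:
        ahead += (1.0 - p_pass) * field_success(update(pmf, theta, False), q_field)
    return ahead - now


def myopic_stop(pmf, theta, q_field, c):
    """True when the expected gain from one further test is below its cost c."""
    return one_step_gain(pmf, theta, q_field) < c


prior = [0.0, 1.0]  # hypothetical prior: exactly one defect present
print(one_step_gain(prior, 0.5, 0.5), myopic_stop(prior, 0.5, 0.5, 0.1))
```

For several subsystems the same comparison would use the product form (5.8); the single-subsystem case is shown only to keep the update step visible.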
Numerical example

The stopping rule in (5.12) was applied to a testing problem with 4 stages, 3 defects being initially present in each stage. Results may be found in Table 1. Four different sets of theta values were considered, namely, $\theta^{(1)}$: 0.75, 0.75, 0.75, 0.75; $\theta^{(2)}$: 0.5, 0.5, 0.5, 0.5; $\theta^{(3)}$: 0.25, 0.25, 0.75, 0.75; and $\theta^{(4)}$: 0.75, 0.75, 0.25, 0.25. The prior distributions used to determine the stopping rules were taken to be Euler in all cases. For these distributions a value of $\lambda$ is required. We explored two different approaches to making this choice. In approach 1 we set $\lambda$ to be $1 - \theta^3$, thus guaranteeing that the unconditional initial subsystem failure probability in the Bayesian model was equal to the (actual) initial failure probability with three defectives. Under this approach, we used priors $E(0.578, 0.75)$, $E(0.875, 0.5)$, and $E(0.984, 0.25)$ as appropriate. These $\lambda$ choices are denoted $\lambda(*1)$ in the table. Under approach 2, for given $\theta$ we chose $\lambda$ such that the mean of the $E(\lambda, \theta)$ distribution was 3. Under this approach, we used priors $E(0.500, 0.75)$, $E(0.672, 0.5)$, and $E(0.734, 0.25)$ with the $\lambda$ choices denoted $\lambda(*2)$ in the table. Testing continued until an appropriate version of the stopping criterion in (5.12) was met with $c$ set equal to 0.1, 0.06, 0.04, 0.02, 0.01. The smaller the value of $c$, the more conservative is the resulting test. Each case (cell of the table) was run 3,000 times. The results are given by the unbracketed values in each cell of the table, which are (reading from top to bottom):

(1)  actual  mean  probability  of  field  success  at  end  of  testing; 

(2)  mean  predictive  probability  of  field  success  under  the  Bayesian  model  at  end  of 
testing; 

(3)  mean  number  of  tests. 

The  bracketed  values  are  the  corresponding  standard  errors. 
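The approach 1 values of $\lambda$ quoted above follow directly from $\lambda = 1 - \theta^3$. The short check below reproduces them; the function name is hypothetical and the default of three defects simply mirrors the example.

```python
def approach1_lambda(theta, defects=3):
    """Approach 1: set lambda to the failure probability with `defects` defects."""
    return 1.0 - theta**defects


for theta in (0.75, 0.5, 0.25):
    print(theta, round(approach1_lambda(theta), 3))
```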
Table 1

Probability of field success and mean number of tests with a Bayes myopic stopping rule

Scenario \ c      0.1              0.06             0.04             0.02             0.01

(θ(1), λ(*1))     0.515 (0.004)    0.736 (0.004)    0.842 (0.003)    0.929 (0.002)    0.966 (0.002)
                  0.475 (0.001)    0.712 (0.000)    0.831 (0.000)    0.927 (0.000)    0.960 (0.000)
                  11.569 (0.025)   15.041 (0.014)   17.415 (0.011)   20.712 (0.009)   22.970 (0.003)

(θ(1), λ(*2))     0.437 (0.003)    [illegible]      [illegible]      0.925 (0.002)    [illegible]
                  0.467 (0.001)                                      0.921 (0.000)
                  10.562 (0.024)                                     19.906 (0.006)

(θ(2), λ(*1))     0.865 (0.004)    [illegible]      0.938 (0.003)    0.969 (0.002)    0.987 (0.001)
                  0.777 (0.001)                     0.937 (0.000)    0.969 (0.000)    0.984 (0.000)
                  13.904 (0.006)                    15.872 (0.007)   16.937 (0.004)   17.973 (0.003)

(θ(2), λ(*2))     0.750 (0.005)    0.867 (0.004)    0.933 (0.003)    0.968 (0.002)    0.983 (0.002)
                  0.801 (0.000)    0.902 (0.000)    0.952 (0.000)    0.976 (0.000)    0.988 (0.000)
                  13.426 (0.014)   14.712 (0.010)   15.862 (0.007)   16.935 (0.005)   17.966 (0.003)

(θ(3), λ(*1))     0.270 (0.002)    0.608 (0.005)    0.843 (0.003)    0.931 (0.002)    0.968 (0.002)
                  0.251 (0.002)    0.564 (0.004)    0.823 (0.000)    0.921 (0.000)    0.965 (0.000)
                  7.952 (0.018)    12.609 (0.053)   16.677 (0.010)   19.886 (0.006)   22.871 (0.006)

(θ(3), λ(*2))     0.253 (0.002)    0.691 (0.004)    0.815 (0.003)    0.918 (0.002)    0.963 (0.002)
                  0.313 (0.002)    0.710 (0.000)    0.832 (0.000)    0.929 (0.000)    0.961 (0.000)
                  7.761 (0.014)    13.738 (0.017)   16.262 (0.014)   19.673 (0.010)   21.937 (0.005)

(θ(4), λ(*1))     0.860 (0.004)    0.908 (0.003)    [illegible]      0.947 (0.002)    0.964 (0.002)
                  0.857 (0.000)    0.862 (0.000)                     0.950 (0.000)    0.964 (0.000)
                  13.554 (0.013)   13.694 (0.009)                    15.792 (0.008)   16.856 (0.007)

(θ(4), λ(*2))     0.860 (0.004)    0.862 (0.004)    0.918 (0.003)    0.954 (0.002)    0.967 (0.002)
                  0.883 (0.000)    0.883 (0.000)    0.935 (0.000)    0.957 (0.000)    0.969 (0.000)
                  13.553 (0.012)   13.561 (0.012)   14.706 (0.010)   15.817 (0.008)   16.867 (0.007)

We note the following from the numerical results. The larger $\lambda$-values obtained from approach 1 usually result in slightly longer tests than those resulting from the smaller values associated with approach 2. The final predictive estimate of field success tends to be slightly conservative (i.e. an underestimate) on the average for approach 1, but tends to be slightly optimistic (i.e. an overestimate) for approach 2. The latter is not surprising since approach 2 adopts priors which imply an overestimate of the initial probability of field success. That said, the results give encouraging evidence of operating characteristics which are robust to the choice of $\lambda$, especially so when $c$ is small. One particular point to note is that for case $\theta^{(2)}$, the characteristics of the Bayes rules when $c = 0.04$ are very comparable to those obtained from the “3 successes in a row” stopping rule. The reader is referred to Figures 4 and 5 above.

5.2  Stagewise  overvariability 

Appendix A and Section 4 argue for a modification of (5.2) by the representation of extra-Binomial stage-to-stage variability in the conditional model. The proposal is to replace (5.2) by

$$q_i(d) = E(\Theta^d) \text{ with } \Theta \sim G_i, \quad 1 \le i \le S. \qquad (5.13)$$

Recall that the Binomial model (5.2) was required for the simple structures above based upon sufficient statistics $(\mathbf{S}, \mathbf{F})$. We conclude that, with the more general (5.13), the posterior distribution $\Pi^i(t)$ will depend upon the entire history $H_{it} = \{x_{i1}, x_{i2}, \ldots, x_{it}\}$, excepting only those entries $x_{ij}$ equal to $O_{ij}$. Put another way, $\Pi^i(t)$ will depend upon the complete sequence of successes and failures to date. It emerges that, while we lose simplicity of structure by generalizing in this way, we make important advances in applicability of the model and in addition develop a rationale for run tests as good stopping criteria.

We  focus  first  on  a  single  subsystem  and,  for  the  present,  drop  subsystem  identifier  i. 

The subsystem has $D(0)$ defects initially with associated (prior) distribution $\Pi = \{\Pi_d,\ d \ge 0\}$. The conditional model is $q(d)$, with $p(d) = 1 - q(d)$, $d \ge 0$. We consider a sequence of $t$ tests during which the subsystem is subject to scrutiny upon $\varphi + \sum_{j=1}^{\varphi+1} \sigma_j$ occasions, of which $\varphi$ result in failure (and defect removal) and $\sum_{j=1}^{\varphi+1} \sigma_j$ result in system success. More precisely, $\sigma_1$ is the number of successes before the first subsystem failure, $\sigma_{\varphi+1}$ is the number of successes following the last ($\varphi$th) subsystem failure and $\sigma_j$, $2 \le j \le \varphi$, is the number of successes between failures $(j-1)$ and $j$. We write $\{\sigma_1, \sigma_2, \ldots, \sigma_{\varphi+1}, \varphi\}$ for this data configuration.
By repeated application of (5.1), the posterior probability of $d$ subsystem defects remaining following these $t$ tests is given by

$$\Pi_{d \mid \sigma, \varphi} \propto \Pi_{d+\varphi} \Big\{\prod_{j=1}^{\varphi} p(d+j)\Big\} \prod_{k=1}^{\varphi+1} \{q(d+\varphi+1-k)\}^{\sigma_k}, \quad d \ge 0. \qquad (5.14)$$


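Further simplification results in the special case

$$q(d) = \frac{1}{d+1},$$

which results from taking $\Theta \sim U[0,1]$ in (5.13). See (A.3) in Appendix A. The posterior distribution in (5.14) then becomes

$$\Pi_{d \mid \sigma, \varphi} \propto \Pi_{d+\varphi}\, \frac{d+1}{d+\varphi+1} \prod_{k=1}^{\varphi+1} \Big(\frac{1}{d+\varphi+2-k}\Big)^{\sigma_k}. \qquad (5.15)$$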


We  now  perform  some  calculations  which  shed  light  upon  the  nature  of  updating  and 
reliability  growth  in  this  context.  A  key  focus  of  the  analysis  will  concern  how  the 
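posterior probability of system survival in the field varies with the data. When we discuss the full system we shall need to restore the subsystem identifier $i$. Consider now two subsystem data configurations $\{\sigma_1, \sigma_2, \ldots, \sigma_{\varphi+1}, \varphi\} \equiv \{\sigma, \varphi\}$ and $\{\sigma'_1, \sigma'_2, \ldots, \sigma'_{\varphi+1}, \varphi\} \equiv \{\sigma', \varphi\}$.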

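Definition. Data configuration $\{\sigma, \varphi\}$ dominates configuration $\{\sigma', \varphi\}$ if $\sum_{j=1}^{k} \sigma_j \le \sum_{j=1}^{k} \sigma'_j$ for $1 \le k \le \varphi$, with $\sum_{j=1}^{\varphi+1} \sigma_j = \sum_{j=1}^{\varphi+1} \sigma'_j$.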


The  above  definition  is  describing  a  partial  ordering  between  data  configurations  in 
which  the  dominating  sequence  has  the  same  (total)  number  of  successes  and  failures,  but 
has  the  failures  earlier.  Note  that  in  the  models  based  on  the  Binomial  conditional  model 
in  (5.2),  the  posterior  distributions  for  the  two  sequences  would  be  identical.  This  is  no 
longer  the  case. 
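The dependence of the posterior on the order of successes and failures can be checked numerically. The sketch below evaluates the posterior as the prior weight of the initial defect count times one factor $p(\cdot)$ per failure and one factor $q(\cdot)$ per success, normalized over $d$. It assumes a truncated Poisson prior with mean 3 (an illustrative choice, not taken from the text) and compares two configurations with two failures and four successes: one with both failures first, one with both failures last.

```python
from math import exp, factorial


def posterior(prior, q, sigma):
    """Posterior pmf of remaining defects for data configuration (sigma, phi).

    sigma has length phi + 1: sigma[0] successes before the first failure,
    sigma[k] successes after the k-th failure.
    """
    phi = len(sigma) - 1
    weights = []
    for d in range(len(prior) - phi):
        w = prior[d + phi]                      # d + phi defects initially
        for j in range(1, phi + 1):
            w *= 1.0 - q(d + j)                 # failure with d + j defects present
        for k in range(1, phi + 2):
            w *= q(d + phi + 1 - k) ** sigma[k - 1]
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def cdf(pmf):
    out, run = [], 0.0
    for p in pmf:
        run += p
        out.append(run)
    return out


# Assumed prior: Poisson(3) truncated to 0..30 and renormalized.
raw = [exp(-3.0) * 3.0**d / factorial(d) for d in range(31)]
prior = [x / sum(raw) for x in raw]

early = [0, 0, 4]   # both failures first, then four successes
late = [4, 0, 0]    # four successes first, then both failures

uniform_mix = lambda d: 1.0 / (d + 1)   # overdispersed model, q(d) = 1/(d+1)
binomial = lambda d: 0.5**d             # Binomial model with theta = 0.5

post_early = posterior(prior, uniform_mix, early)
post_late = posterior(prior, uniform_mix, late)
print(post_early[:4])
print(post_late[:4])
```

Under `binomial` the two posteriors coincide, while under `uniform_mix` the early-failure configuration puts more mass on small defect counts.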
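We now generalize the material around (5.8) by writing $Q_i\{(\sigma^i, \varphi^i)\}$ for the predictive probability of field success for subsystem $i$ following data $(\sigma^i, \varphi^i)$, $1 \le i \le S$, where

$$Q_i\{(\sigma^i, \varphi^i)\} = \sum_{d \ge 0} \Pi^i_{d \mid \sigma^i, \varphi^i}\, Q_i(d). \qquad (5.16)$$

The corresponding predictive probability for the system as a whole is

$$Q\{(\sigma^1, \varphi^1), (\sigma^2, \varphi^2), \ldots, (\sigma^S, \varphi^S)\} = \prod_{i=1}^{S} Q_i\{(\sigma^i, \varphi^i)\}. \qquad (5.17)$$

In the following result we use $\Pi^i_{\sigma^i, \varphi^i}$ as a notational shorthand for the (posterior) distribution for the number of defects in subsystem $i$ following data configuration $(\sigma^i, \varphi^i)$, $1 \le i \le S$.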


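Theorem. For any choices of prior distribution $\Pi^i$ and conditional model (5.13) for which $q_i(1) = E_{G_i}(\Theta) < 1$, the following hold:

(1) If $\{\sigma^i, \varphi^i\}$ dominates $\{\sigma'^i, \varphi^i\}$ then $\Pi^i_{\sigma^i, \varphi^i}$ is stochastically smaller than $\Pi^i_{\sigma'^i, \varphi^i}$;

(2) If the sequence $\{Q_i(d),\ d \ge 0\}$ is non-increasing and $\{\sigma^i, \varphi^i\}$ dominates $\{\sigma'^i, \varphi^i\}$, $1 \le i \le S$, then

$$Q_i\{(\sigma^i, \varphi^i)\} \ge Q_i\{(\sigma'^i, \varphi^i)\}, \quad 1 \le i \le S,$$

and hence

$$Q\{(\sigma^1, \varphi^1), (\sigma^2, \varphi^2), \ldots, (\sigma^S, \varphi^S)\} \ge Q\{(\sigma'^1, \varphi^1), (\sigma'^2, \varphi^2), \ldots, (\sigma'^S, \varphi^S)\}.$$

(3) If the sequence $\{Q_i(d),\ d \ge 0\}$ is non-increasing then $Q_i\{(\sigma^i, \varphi^i)\}$ is non-decreasing in each $\sigma^i_j$, $1 \le j \le \varphi^i + 1$, $1 \le i \le S$, as is $Q\{(\sigma^1, \varphi^1), (\sigma^2, \varphi^2), \ldots, (\sigma^S, \varphi^S)\}$.

(4) If $\Pi^i_0 > 0$, $1 \le i \le S$, then during a run of $r$ successes the predictive probability of field success approaches 1 at a geometric rate in $r$.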
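Proof.

(1) Let $j$ be such that $\sigma'^i_j \ge 1$ and $1 \le j \le \varphi^i$. Consider configuration $\{\sigma'^i + \mathbf{1}^{j+1} - \mathbf{1}^j, \varphi^i\}$. Direct application of (5.14) shows that

$$\frac{\Pi^i_{d \mid \sigma'^i, \varphi^i}}{\Pi^i_{d \mid \sigma'^i + \mathbf{1}^{j+1} - \mathbf{1}^j, \varphi^i}} = K_i(\sigma'^i, \varphi^i, j)\, \frac{q_i(d + \varphi^i + 1 - j)}{q_i(d + \varphi^i - j)} = K_i(\sigma'^i, \varphi^i, j)\, \frac{E_{G_i}(\Theta^{d + \varphi^i + 1 - j})}{E_{G_i}(\Theta^{d + \varphi^i - j})}, \quad d \ge 0, \qquad (5.18)$$

for some constant $K_i(\sigma'^i, \varphi^i, j)$. But since distribution $G_i$ has support contained in $[0,1]$ it is straightforward to show that $\{E_{G_i}(\Theta^{d+1}) / E_{G_i}(\Theta^{d}),\ d \ge 0\}$ is a non-decreasing sequence. It follows immediately from (5.18) that the distribution $\Pi^i_{\sigma'^i + \mathbf{1}^{j+1} - \mathbf{1}^j, \varphi^i}$ is smaller than $\Pi^i_{\sigma'^i, \varphi^i}$ in the likelihood ratio ordering and hence also in the stochastic ordering.

However, we can move from $(\sigma'^i, \varphi^i)$ to dominating configuration $(\sigma^i, \varphi^i)$ by means of a sequence of transitions of the form $\sigma'^i \to \sigma'^i + \mathbf{1}^{j+1} - \mathbf{1}^j$ for some $j$. This, together with the transitivity of stochastic ordering, yields (1).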

(2)  is  a  simple  consequence  of  (1). 

The  proof  of  (3)  involves  straightforward  application  of  (5.14)  and  is  omitted. 

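(4) A run of $r$ successes means that each subsystem data configuration is $(r, 0)$. By (5.14) we have that

$$Q_i\{(r, 0)\} \ge \Pi^i_{0 \mid r, 0} = \frac{\Pi^i_0}{\sum_{d \ge 0} \Pi^i_d \{q_i(d)\}^r} \ge \frac{\Pi^i_0}{\Pi^i_0 + \{1 - \Pi^i_0\}\{q_i(1)\}^r}, \qquad (5.19)$$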

where inequality (5.19) utilizes the decreasing nature of the sequence $\{E_{G_i}(\Theta^d),\ d \ge 0\} = \{q_i(d),\ d \ge 0\}$. From (5.19) we deduce that the predictive probability of field success for the whole system is at least

$$\prod_{i=1}^{S} \frac{\Pi^i_0}{\Pi^i_0 + \{1 - \Pi^i_0\}\{q_i(1)\}^r} \ge 1 - \sum_{i=1}^{S} \frac{1 - \Pi^i_0}{\Pi^i_0} \{q_i(1)\}^r$$

and the result is a straightforward consequence.
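The geometric rate in part (4) can be seen numerically. The following sketch compares the exact posterior probability of zero defects after a run of $r$ successes with the closed-form lower bound $\Pi_0 / (\Pi_0 + (1 - \Pi_0)\, q(1)^r)$ for one subsystem. The four-point prior and the choice $q(d) = 1/(d+1)$ are assumptions made only for this illustration.

```python
def posterior_zero_defects(prior, q, r):
    """Exact posterior probability of no defects after r successes and no failures."""
    denom = sum(p * q(d) ** r for d, p in enumerate(prior))
    return prior[0] / denom


def lower_bound(pi0, q1, r):
    """Closed-form lower bound using q(d) <= q(1) for every d >= 1."""
    return pi0 / (pi0 + (1.0 - pi0) * q1 ** r)


prior = [0.2, 0.3, 0.3, 0.2]            # hypothetical prior on 0..3 defects
q = lambda d: 1.0 / (d + 1)             # uniform-mixing conditional model

for r in (1, 3, 5, 10):
    exact = posterior_zero_defects(prior, q, r)
    bound = lower_bound(prior[0], q(1), r)
    print(r, round(exact, 4), round(bound, 4))
```

The shortfall `1 - bound` is at most `(1 - pi0) / pi0 * q(1)**r`, which is the geometric decay claimed in the theorem.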

The  conclusions  of  the  above  result  are  strongly  suggestive  that  runs  tests,  while  not 
being  Bayes  optimal  in  the  formal  sense  above,  should  nevertheless  provide  simple  and 
effective  designs  for  a  range  of  reasonable  cost  criteria.  We  discuss  prior  analyses  of  such 
tests  later. 


5.3  Bayes  confidence  regions 

A natural focus for inference following testing is the unknown parameter $p$(system survival in the field). Suppose that, as in (4.4), this takes the value $\prod_{i=1}^{S} Q_i(d_i)$ when the (unknown) number of defectives remaining in subsystem $i$ following testing is $d_i$, $1 \le i \le S$. The case in which $Q_i(d_i) = Q^{d_i}$, $1 \le i \le S$, and $Q$ is a constant is particularly simple and we consider this first. In this model the probability of field survival is $Q^{\sum_{i=1}^{S} d_i}$.

Let $\Pi^i$ be the posterior distribution for the number of defective modes remaining in subsystem $i$ following testing. The $\Pi^i$, $1 \le i \le S$, yield $\tilde\Pi$, the posterior distribution for the number of defective modes remaining in the entire system following testing. For given $\alpha > 0$, let $D(\alpha)$ be given by

$$D(\alpha) = \min\Big\{D;\ \sum_{n=0}^{D} \tilde\Pi_n \ge 1 - \alpha\Big\};$$

then $\{1, Q, \ldots, Q^{D(\alpha)}\}$ is a $100(1 - \alpha)\%$ Bayes confidence region for the parameter of interest.
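As an illustration of the constant-$Q$ case, the sketch below convolves the subsystem posteriors to obtain the posterior of the total number of remaining defects, then finds the smallest total whose cumulative probability reaches $1 - \alpha$. The two-point posteriors and the value $Q = 0.9$ are hypothetical inputs, and the function names are not from the paper.

```python
def convolve(a, b):
    """pmf of the sum of two independent non-negative integer variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def defect_bound(posteriors, alpha):
    """Smallest D with P(total remaining defects <= D) >= 1 - alpha."""
    total = [1.0]
    for pmf in posteriors:
        total = convolve(total, pmf)
    cum = 0.0
    for D, p in enumerate(total):
        cum += p
        if cum >= 1.0 - alpha - 1e-12:
            return D
    return len(total) - 1


def confidence_region(posteriors, Q, alpha):
    """Values {1, Q, ..., Q**D(alpha)} of the field survival probability."""
    return [Q**n for n in range(defect_bound(posteriors, alpha) + 1)]


posteriors = [[0.5, 0.5], [0.5, 0.5]]   # hypothetical subsystem posteriors
print(confidence_region(posteriors, 0.9, 0.3))
```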
In general, we need to work with an ordering of the quantities $\{\prod_{i=1}^{S} Q_i(d_i),\ d_1 \ge 0, d_2 \ge 0, \ldots, d_S \ge 0\}$. Call the (ordered) members of the collection $1 = Q_0 > Q_1 > Q_2 > \ldots$ with

$$\mathcal{D}_r = \Big\{\mathbf{d};\ \prod_{i=1}^{S} Q_i(d_i) = Q_r\Big\}$$

and

$$P_r = \sum_{\mathbf{d} \in \mathcal{D}_r} \Big\{\prod_{i=1}^{S} \Pi^i_{d_i}\Big\}$$

for the corresponding posterior probability. For given $\alpha > 0$, let $r(\alpha)$ be given by

$$r(\alpha) = \min\Big\{r;\ \sum_{n=0}^{r} P_n \ge 1 - \alpha\Big\};$$

then $\{1, Q_1, Q_2, \ldots, Q_{r(\alpha)}\}$ is a $100(1 - \alpha)\%$ Bayes confidence region for $p$(system survival in the field).


5.4  Prior  analysis  of  test  designs 

As in Section 4, proposed test designs may be assessed by means of a prior analysis (i.e. in advance of the tests) focusing on such key measures as the mean $p$(system survival in the field) following the test, the mean time to the conclusion of testing, and the probability that the field probability of success is greater than $1 - \alpha$. From a Bayesian viewpoint, the appropriate measures will be expectations taken with respect to the prior distributions $\Pi^i$, $1 \le i \le S$. Suppose that the $P_r(d_1, d_2, \ldots, d_S)$ are available for $d_i \ge 0$, $1 \le i \le S$, by the computations described in Section 4.2. Then

$$\sum \Big\{\prod_{i=1}^{S} \Pi^i(d_i)\Big\} P_r(d_1, d_2, \ldots, d_S) \qquad (5.20)$$

is the appropriate measure of, say, the mean $p$(system survival in the field) following a “run of $r$” test. The summation in (5.20) is over all $d_i \ge 0$, $1 \le i \le S$.
In the case of very diffuse priors, implementation of (5.20) may be computationally expensive. Simpler alternatives exist for some of the specially structured models described at the beginning of this section. Consider, for example, a situation in which $\Pi^i = E(\lambda_i, \theta_i)$, $1 \le i \le S$, and we have the conditional Binomial model of (5.2). From (5.9)
we  conclude  that,  since  $  =  0  for  this  choice  of  prior,  we  have 


$$\sum_{d \ge 0} \theta_i^{d}\, \Pi^i_d = 1 - \lambda_i, \quad 1 \le i \le S,$$


for the unconditional probability of subsystem $i$ success initially. When we further assume that the $p$(system survival in the field) takes the conditional form $\prod_{i=1}^{S} \theta_i^{d_i}$, we then have for the mean probability of system field success


Z{Yl^Pr(dud2 . ds).Qr{Xl,X2,...,XS) 

/=1  lM  J  is  1 

^  r'r’  ''  .  V  * 

Run  of  r  successful  tests  occurs  before  any  test  failures 


T  S  t-\  f  S 


+nn 


*=i  /=i  t'=\  L  i=i 


fi(i-vr) 


7=1 


Start  over  after  a  failure  at  stage  i  during  test  t<r 


and  Q.(0, 0,...,0)  =  1. 

Numerical  example. 

Table 2 reports results from a numerical study of the probability of system field success after a test which ends with $r$ successes in a row. The system consists of 4 stages. Given $d_s$ defects in stage $s$, $s = 1, \ldots, 4$, the conditional probability that the system passes one test is $\prod_{i=1}^{4} q_i(d_i)$ where

$$q_s(d_s) = E[\Theta_s^{d_s}] = \frac{\Gamma(a_s + b_s)}{\Gamma(b_s)}\, \frac{\Gamma(b_s + d_s)}{\Gamma(a_s + b_s + d_s)}$$
with $\Theta_s$ having a beta distribution. Two cases of randomized $\Theta_s$ are considered. In case A, each theta is drawn from a uniform distribution independently for each stage and test. In case B, $\Theta_s$ is drawn from a beta distribution with mean $b_s / (a_s + b_s)$ for


'(•9,1) 

for  s  =  l, 

(.7,3) 

for  s  =  2, 

(-3,7) 

for  s  =  3, 

(.1,9) 

for  s  =  4 

In all but three cases, the field probability of system success is $\prod_{s=1}^{4} (0.8)^{d_s(r)}$, where $d_s(r)$ is the number of defects remaining in stage $s$ after the test is complete. The initial numbers of defects in each stage are independently drawn from Poisson distributions with the means noted in the table. There are 25 replications for each case. Displayed are the mean of the mean probability of system field success, the means of the probabilities that the probability of system field success after the test is greater than or equal to 0.7, 0.8, 0.9, and 0.95, and the mean of the mean number of tests required to obtain a run of $r$ successes. The standard errors of the means appear in parentheses underneath the means.

The distribution of the $\Theta_s$, $s = 1, \ldots, 4$, has a great effect on the probability of successful field performance after the test. In case B, the defects in stage 4 are less likely to reveal themselves during the test. Thus for case B, the probability of field success after a test until a run of 3 successes is smaller than for the case of uniformly distributed $\Theta_s$, $s = 1, \ldots, 4$.

The  initial  mean  number  of  defects  in  each  stage  also  affects  the  probability  of  field 
success.  The  case  with  5  defects  in  stage  4  has  the  smallest  mean  of  the  mean  probability 
of  field  success  after  a  test.  The  mean  of  the  mean  probabilities  of  field  success  after  a 
test  until  there  is  a  run  of  7  successes  in  a  row  is  0.66  for  this  case. 
The mean of the mean number of tests needed to obtain a run of r successes for the
cases displayed is somewhat insensitive to the pattern of the initial mean number of
defects in each stage and the probability of defect discovery during test.

In all but three of the cases the probability that a remaining defect in a stage does not cause failure during use in the field is 0.8, which differs from the corresponding probabilities during testing. In the three cases in which the probability of field success is the same as that in testing, the mean probabilities of field success are higher. It is important to design tests so that they represent field conditions as closely as possible.
Table 2

Mean of summary statistics for simulations of testing until a run of r successes is obtained

Columns: number of successes in a row, r; probability of survival in the field for each remaining defect; over-variability case (A or B); mean of the mean probability of field survival; mean probabilities that the probability of field survival exceeds 0.7, 0.8, 0.9 and 0.95; mean of the mean number of tests. Standard errors are in parentheses.

r    Field prob per remaining defect    Case    Mean of the mean number of tests
3    0.8                                A       14.19 (0.72)
3    0.8                                B       13.31 (0.67)
5    0.8                                B       17.02 (0.84)
7    0.8                                B       20.22 (0.95)
7    field prob same as testing         B       20.22 (0.95)

[Remaining table entries are illegible in the source.]



Summary  and  Conclusions 

In  this  paper  we  consider  models  of  overall  system  testing  to  achieve  reliability 
growth  by  design  defect  identification  and  removal.  This  is  sometimes  referred  to  as  Test- 
and-Fix  (TAF).  We  consider  a  system  with  S  stages  in  sequence;  if  a  test  reveals  a  defect 
in stage s, the later stages s + 1, ..., S are not subjected to the test. We assume that at most
one  defect  is  removed  per  test. 

A  sequential  test  plan  that  ensures  that  all  the  stages  will  be  tested  at  least  r  times  is  to 
test  until  there  is  a  run  of  r  consecutive  system  successes.  A  system  success  means  that  all 
the stages operate successfully during the test, which implies that the propensities to fail
of remaining design defects are likely to be small. Results obtained for a Bayesian model
formulation  suggest  that,  while  not  being  Bayes  optimal  in  a  formal  sense,  a  runs  test 
provides  a  simple  and  effective  test  stopping  rule  for  a  range  of  reasonable  cost  criteria. 

We  propose  analytical  procedures  to  calculate  the  mean  probability  of  field  system 
survival  after  successful  completion  of  a  runs  test,  the  distribution  of  the  probability  of 
system  field  survival  after  a  successful  runs  test,  and  the  mean  number  of  individual 
system  tests  required  to  achieve  a  run  of  r  successes,  and  hence  test  termination. 
Numerical  studies  indicate  that  the  probability  of  system  field  success  after  a  runs  test  can 
be  quite  sensitive  to  the  probabilities  that  a  test  activates  faults  in  each  of  the  stages. 
However,  the  mean  number  of  tests  required  to  obtain  a  run  of  r  successful  tests  appears 
to  be  relatively  insensitive  to  these  activation  probabilities.  This  suggests  that  it  is 
important  to  design  operational  tests  so  that  the  test  mimics  field  operation  of  the  system 
as  closely  as  possible. 

The  procedures  of  this  paper  have  been  programmed  in  Visual  Basic  and  Excel.  The 
software  is  available  from  PAJ.  Exercise  of  such  software  can  provide  guidance  to  test 
planners  and  analysts. 
APPENDIX A

Models  for  Stage-wise  Over/Extra  Variability 


Generalize the initial binomial stagewise (sub)model by randomizing $\theta_s$: replace $\theta_s$ by the random variable $\Theta_s$ and replace $\theta_s^{d_s}$ by $E[\Theta_s^{d_s}] = q_s(d_s)$.

Beta  Mixing: 


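$$E[\Theta_s^{d_s}] = q_s(d_s) = \frac{\Gamma(a_s + b_s)}{\Gamma(a_s)\Gamma(b_s)} \int_0^1 x^{a_s - 1}(1 - x)^{b_s - 1}(1 - x)^{d_s}\,dx = \frac{\Gamma(a_s + b_s)}{\Gamma(b_s)}\, \frac{\Gamma(b_s + d_s)}{\Gamma(a_s + b_s + d_s)} \qquad (A.1)$$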


One reasonable normalization is

$$\theta_s = \frac{b_s}{a_s + b_s} \qquad (A.2)$$

where $\theta_s$ is the original “deterministic” survival probability, i.e. put $q_s(1) = \theta_s$.

The above model simultaneously chooses the same random value for each defect in a stage for each visit to the stage, and is independent between stages. For a uniform distribution ($a_s = b_s = 1$),

$$q_s(d_s) = \frac{1}{d_s + 1}, \qquad (A.3)$$

while the corresponding non-mixed version is $\theta_s^{d_s} = (1/2)^{d_s}$, the latter decreasing much more rapidly than the former.
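The closed form for Beta mixing is easy to evaluate with log-gamma functions. The sketch below computes $q_s(d_s) = \Gamma(a+b)\Gamma(b+d) / \{\Gamma(b)\Gamma(a+b+d)\}$ and compares the uniform case $a = b = 1$ with the non-mixed value $(1/2)^{d}$; the function name `q_beta` is a label chosen here.

```python
from math import exp, lgamma


def q_beta(d, a, b):
    """E[Theta**d] under Beta mixing, computed on the log scale for stability."""
    return exp(lgamma(a + b) + lgamma(b + d) - lgamma(b) - lgamma(a + b + d))


for d in range(6):
    mixed = q_beta(d, 1.0, 1.0)      # uniform mixing, equals 1 / (d + 1)
    fixed = 0.5**d                   # non-mixed version with theta = 1/2
    print(d, round(mixed, 4), round(fixed, 4))
```

For general $(a, b)$ the value at $d = 1$ is $b / (a + b)$, which is the normalization in (A.2).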

Alternative  (“Transform”)  Mixing: 
Write

$$q_s(d_s) = E[\Theta_s^{d_s}] = E\big[e^{-d_s(-\ln \Theta_s)}\big], \qquad (A.4)$$

which is the Laplace transform of the positive random variable $(-\ln \Theta_s)$; here are some tractable possibilities:
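Gamma Mixing:

The Laplace transform of $Y \sim \mathrm{Gam}(\beta = \text{shape},\ \mu = \text{mean})$ is

$$E[e^{-\xi Y}] = \Big(1 + \frac{\xi \mu}{\beta}\Big)^{-\beta}, \qquad (A.5)$$

so if $Y = -\ln \Theta$, put $\xi = d$ to find

$$q(d) = \frac{1}{(1 + d\mu/\beta)^{\beta}}. \qquad (A.6)$$

If $\mu = \beta = 1$ the result equals the uniform (Beta) result, but the Gamma result above is more flexible. Normalizing at $d = 1$,

$$q(1) = \frac{1}{(1 + \mu/\beta)^{\beta}} = \theta. \qquad (A.7)$$

If $\beta$ is an optional tuning parameter,

$$q(d) = \frac{1}{\{1 + d(\theta^{-1/\beta} - 1)\}^{\beta}}. \qquad (A.8)$$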
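A small numerical sketch of Gamma mixing follows. It evaluates $q(d) = (1 + d\mu/\beta)^{-\beta}$ and a normalized variant obtained by solving the condition $q(1) = \theta$ for $\mu$, which gives $\mu = \beta(\theta^{-1/\beta} - 1)$. The function names are hypothetical, and the normalized form is derived here from that condition rather than quoted from the text.

```python
def q_gamma(d, mu, beta):
    """Laplace transform of a Gamma(shape=beta, mean=mu) variable, evaluated at d."""
    return (1.0 + d * mu / beta) ** (-beta)


def q_gamma_normalized(d, theta, beta):
    """Gamma-mixed q(d) with mu chosen so that q(1) equals theta."""
    mu = beta * (theta ** (-1.0 / beta) - 1.0)
    return q_gamma(d, mu, beta)


for beta in (0.5, 1.0, 5.0, 50.0):
    row = [round(q_gamma_normalized(d, 0.7, beta), 4) for d in range(5)]
    print(beta, row)
```

As `beta` grows the mixing distribution concentrates and the values approach the deterministic `theta**d`.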


Stable  Law  Mixing: 

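The Laplace transform of a positive stable law (see Feller, 1966) is

$$E[e^{-\xi Y}] = e^{-\alpha \xi^{\beta}}, \quad \text{for } \alpha > 0 \text{ and } 0 < \beta < 1. \qquad (A.9)$$

If $Y = -\ln \Theta$, then $q(d) = e^{-\alpha d^{\beta}}$. Normalize at $d = 1$ to get

$$q(d) = \theta^{d^{\beta}} \qquad (A.10)$$

where $\theta$ is the “deterministic” probability of defect survival.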

Inverse Gaussian (IG) Mixing (see Johnson, Kotz, and Balakrishnan, 1994):

The IG distribution is that of the first-passage time of a Brownian motion with drift. If $Y = -\ln \Theta$ then

$$q(d) = E[\Theta^d] = E[e^{-dY}] = \exp\Big\{-\frac{1}{c^2}\Big[(1 + 2c^2 d\mu)^{1/2} - 1\Big]\Big\} \qquad (A.11)$$

where

$$c = \frac{\sigma}{\mu}, \qquad (A.12)$$

the latter being the coefficient of variation of $(-\ln \Theta)$. $q(d) \to e^{-\mu d}$ if $c \to 0$. The $q(d)$ depends on the tuning parameter $c$.
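The effect of the tuning parameter $c$ can be illustrated with the standard Laplace transform of an inverse Gaussian variable, written in terms of its mean $\mu$ and coefficient of variation $c$. This parameterization is an assumption made for the sketch, as is the function name `q_ig`.

```python
from math import exp, sqrt


def q_ig(d, mu, c):
    """E[exp(-d * Y)] for Y inverse Gaussian with mean mu and coefficient of variation c."""
    return exp(-(sqrt(1.0 + 2.0 * c * c * mu * d) - 1.0) / (c * c))


mu = 0.5
for c in (0.001, 0.5, 1.0, 2.0):
    print(c, [round(q_ig(d, mu, c), 4) for d in range(5)])
```

Small `c` reproduces the deterministic decay `exp(-mu * d)`, while larger `c` slows the decay in `d`, in line with the over-variability argument of this appendix.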


APPENDIX  B 

Effects  of  Test-to-Test  Variability 


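Let $T(d_1, d_2, \ldots, d_S; \boldsymbol{\varepsilon})$ denote the random number of tests to achieve a run of $r$ successes for the first time, conditional on $d_i$ defects being initially present in stage $i$, $i = 1, \ldots, S$, and on the environmental test-specific random variables $\boldsymbol{\varepsilon} = (\varepsilon_1, \varepsilon_2, \ldots)$; these latter are assumed positive, independently sampled, and held fixed for each entire test; they thus represent test-to-test variability, which can be random (as here), but also deterministic, explanatory/regression variables. It can be seen that, conditional on the $\boldsymbol{\varepsilon}$ components,

$$T(d_1, \ldots, d_j, \ldots, d_S; \boldsymbol{\varepsilon}) = n + T(d_1, \ldots, d_j - 1, \ldots, d_S; \boldsymbol{\varepsilon}') \quad \text{for } n = 1, 2, \ldots, r;\ j = 1, 2, \ldots, S,$$

with probability

$$\Big[\prod_{k=1}^{n-1} \prod_{i=1}^{S} (q_i(d_i))^{\varepsilon_k}\Big] \Big[\prod_{i=1}^{j-1} (q_i(d_i))^{\varepsilon_n} - \prod_{i=1}^{j} (q_i(d_i))^{\varepsilon_n}\Big],$$

and

$$T(d_1, \ldots, d_S; \boldsymbol{\varepsilon}) = r \quad \text{with probability } \prod_{k=1}^{r} \prod_{i=1}^{S} (q_i(d_i))^{\varepsilon_k}. \qquad (B.1)$$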

The  next  steps  lead  to  finding  the  mean  time  to  first  attain  a  run  of  r  test  successes, 
thus  stopping  the  overall  test. 

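First, take the expectation, conditional on $\boldsymbol{\varepsilon}$:

$$E[T(d_1, d_2, \ldots, d_S; \boldsymbol{\varepsilon})] = \sum_{n=1}^{r} \sum_{j=1}^{S} \big\{n + E[T(d_1, \ldots, d_j - 1, \ldots, d_S; \boldsymbol{\varepsilon}')]\big\} \Big[\prod_{k=1}^{n-1} \prod_{i=1}^{S} (q_i(d_i))^{\varepsilon_k}\Big] \Big[\prod_{i=1}^{j-1} (q_i(d_i))^{\varepsilon_n} - \prod_{i=1}^{j} (q_i(d_i))^{\varepsilon_n}\Big]$$
$$\qquad + \; r \prod_{i=1}^{S} (q_i(d_i))^{\varepsilon_1} \prod_{i=1}^{S} (q_i(d_i))^{\varepsilon_2} \cdots \prod_{i=1}^{S} (q_i(d_i))^{\varepsilon_r}. \qquad (B.2)$$

The first term is the conditional (on $\boldsymbol{\varepsilon}$) expected time to a run of $r$ when there is no run of $r$ in the first $r$ tests (the run-breaker occurs at Stage $j$; the next test starts over with $d_j - 1$ defects in Stage $j$). The second term is the conditional expected value of run length when the run occurs on the first $r$ tests.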

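Next, remove the condition on $\boldsymbol{\varepsilon}$, noting that the $\boldsymbol{\varepsilon}$ appearing in $T(d_1, d_2, \ldots, d_j - 1, \ldots, d_S; \boldsymbol{\varepsilon})$ refers to future tests, and is here assumed independent of all before; this is a plausible convenience but not a necessity. Taking expectations or removing the $\boldsymbol{\varepsilon}$-condition under the iid assumption yields

$$\tau_r(d_1, d_2, \ldots, d_S) = E_{\varepsilon} E[T(d_1, d_2, \ldots, d_S; \boldsymbol{\varepsilon})]$$
$$= \sum_{n=1}^{r} \sum_{j=1}^{S} \{n + \tau_r(d_1, \ldots, d_j - 1, \ldots, d_S)\} (M_S(d_1, d_2, \ldots, d_S))^{n-1} (M_{j-1}(d_1, d_2, \ldots, d_S) - M_j(d_1, d_2, \ldots, d_S)) + r (M_S(d_1, d_2, \ldots, d_S))^{r} \qquad (B.3)$$

where

$$M_{j-1}(d_1, d_2, \ldots, d_S) = E_{\varepsilon}\Big[\prod_{i=1}^{j-1} (q_i(d_i))^{\varepsilon}\Big] = E_{\varepsilon}\Big[\exp\Big\{-\varepsilon \sum_{i=1}^{j-1} (-\ln q_i(d_i))\Big\}\Big] \qquad (B.4)$$

and

$$M_S(d_1, d_2, \ldots, d_S) = E_{\varepsilon}\Big[\prod_{i=1}^{S} (q_i(d_i))^{\varepsilon}\Big] = E_{\varepsilon}\Big[\exp\Big\{-\varepsilon \sum_{i=1}^{S} (-\ln q_i(d_i))\Big\}\Big]. \qquad (B.5)$$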

The  latter  expressions  can  be  evaluated  in  terms  of  the  Laplace  transform  of  the 
^-distribution.  Many  such  transforms  of  distributions  are  simple  and  explicit;  see 
Appendix  A  for  examples. 
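The recursion for the mean number of tests can be sketched in a few lines for the special case of a fixed environment ($\varepsilon \equiv 1$), in which each $M_j$ reduces to the product of the first $j$ stage pass probabilities. The function `mean_tests`, the callable `q(i, d)` giving the pass probability of stage `i` with `d` defects, and the requirement that `q(i, 0) == 1` are assumptions of this sketch; with random environments the products would be replaced by the Laplace-transform expressions.

```python
from functools import lru_cache


def mean_tests(d0, q, r):
    """Mean number of tests to reach a run of r successes, fixed environment."""
    S = len(d0)

    @lru_cache(maxsize=None)
    def tau(d):
        # M[j] = probability that stages 1..j all pass on a single test
        M = [1.0]
        for i in range(S):
            M.append(M[-1] * q(i, d[i]))
        MS = M[S]
        total = r * MS**r                       # run achieved in the first r tests
        for j in range(1, S + 1):
            p_fail_j = M[j - 1] - M[j]          # test stops with a failure at stage j
            if p_fail_j <= 0.0:
                continue
            d_next = d[:j - 1] + (d[j - 1] - 1,) + d[j:]   # one defect removed
            t_next = tau(d_next)
            for n in range(1, r + 1):           # failure occurs on test n
                total += (n + t_next) * MS**(n - 1) * p_fail_j
        return total

    return tau(tuple(d0))


binomial = lambda i, d: 0.5**d                  # hypothetical stage model
print(mean_tests((3, 3, 3, 3), binomial, 3))
```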

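Let $R(d_1, d_2, \ldots, d_S; \boldsymbol{\varepsilon})$ denote the random probability of field success after passing a test stopped after a run of $r$ successes, again conditional on test-specific environmental random variables $\boldsymbol{\varepsilon}$; $\varepsilon(f)$ refers to field environments, which may differ from those of the tests.

$$R(d_1, d_2, \ldots, d_S; \boldsymbol{\varepsilon}) = R(d_1, d_2, \ldots, d_i - 1, \ldots, d_S; \boldsymbol{\varepsilon}') \quad \text{for } n = 1, 2, \ldots, r \text{ and } i = 1, 2, \ldots, S,$$

with probability

$$\Big[\prod_{k=1}^{n-1} \prod_{l=1}^{S} (q_l(d_l))^{\varepsilon_k}\Big] \Big[\prod_{l=1}^{i-1} (q_l(d_l))^{\varepsilon_n} - \prod_{l=1}^{i} (q_l(d_l))^{\varepsilon_n}\Big]$$

(failure before run of $r$; start over), and

$$R(d_1, d_2, \ldots, d_S; \boldsymbol{\varepsilon}) = \prod_{i=1}^{S} (Q_i(d_i))^{\varepsilon(f)} \quad \text{with probability } \prod_{k=1}^{r} \prod_{i=1}^{S} (q_i(d_i))^{\varepsilon_k}$$

(no failure during run of $r$).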

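Remove conditions on $\boldsymbol{\varepsilon}$ (iid) and sum, using

$$M_j(d_1, d_2, \ldots, d_S) = E_{\varepsilon}\Big[\exp\Big\{-\varepsilon \sum_{i=1}^{j} (-\ln q_i(d_i))\Big\}\Big], \quad j = 1, 2, \ldots, S,$$

$$M_{S,f}(d_1, d_2, \ldots, d_S) = E_{\varepsilon(f)}\Big[\exp\Big\{-\varepsilon(f) \sum_{i=1}^{S} (-\ln Q_i(d_i))\Big\}\Big],$$

to obtain

$$P_r(d_1, d_2, \ldots, d_S) = E_{\varepsilon} E[R(d_1, d_2, \ldots, d_S; \boldsymbol{\varepsilon})]$$
$$= (M_S(d_1, d_2, \ldots, d_S))^{r}\, M_{S,f}(d_1, d_2, \ldots, d_S) + \sum_{n=1}^{r} (M_S(d_1, d_2, \ldots, d_S))^{n-1} \times \sum_{i=1}^{S} P_r(d_1, d_2, \ldots, d_i - 1, \ldots, d_S)\,[M_{i-1}(d_1, d_2, \ldots, d_S) - M_i(d_1, d_2, \ldots, d_S)],$$

where $\sum_{n=1}^{r} (M_S)^{n-1} = \{1 - (M_S)^{r}\} / \{1 - M_S\}$. The first term corresponds to a run of $r$ successful tests before any failures. The geometric sum corresponds to no $r$-run during the first $r$ tests, and the final sum to starting over after failure at some stage ($i$) before a run of $r$ is achieved, with the defect removed and testing continued.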


The  following  figures  display  the  important  role  that  the  presence  of  environmental 
variability  may  play  in  the  ability  of  operational  testing  to  result  in  the  fielding  of  reliable 
systems. 

Figure B.1 displays probabilities of system field success for a system that has been tested until there is a run of 5 successes. The testing environmental random variables have a gamma distribution with mean 1 and shape parameter 0.5. The field environmental random variable has a gamma distribution with mean 1 and shape parameter $\beta$, which has been made widely variable. Note that the smaller $\beta$ is, the greater is the probability of field system survival. In the present case the testing environment is variable enough to produce an effect that is, in the quite disparate field conditions, quite insensitive to the distribution of random field effects.

[Figure B.1: Probability of system survival after completing test until a run of 5 successes in a row; θ = 0.5, 0.5, 0.5, 0.5. Test: gamma-distributed environment, mean = 1 and shape parameter = 0.5. Field: gamma-distributed environment, mean = 1 and shape parameter β. Horizontal axis: initial number of defects in each of 4 stages. Series: field environment β = 0.1, 0.5, 1, 5.]

In contrast to Figure B.1, Figure B.2 (respectively B.3) displays probabilities of field
success (respectively, the mean number of tests to obtain 5 successes in a row) for a system
that has again been tested until there is a run of 5 successes in a row. Here the field
environmental random variable has a gamma distribution with mean 1 and shape
parameter equal to 0.5. The test environment random variables have a gamma distribution
with mean 1 and variable shape parameter β. The small shape parameter, β = 0.1, results
in a smaller mean number of tests required but at the price of a smaller probability of field
success. The reason: a gamma density function with β = 0.1 has most of its mass close to
0. Thus, most of the time the probability that a defect is revealed during a test is close to
0, and the test is over too soon to eliminate many faults. However, since the field
environment random variable has a shape parameter equal to 0.5, the defects remaining
after the test is completed are likely to be triggered in the field.
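
The statement that a mean-1 gamma density with a small shape parameter puts most of its mass close to 0 can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name gamma_cdf and the cutoff 0.1 used for "close to 0" are assumptions chosen for illustration. It evaluates the gamma distribution function by the standard power series for the regularized lower incomplete gamma function, with the scale set to mean/shape so that every distribution has mean 1.

```python
import math


def gamma_cdf(x, shape, mean=1.0):
    """P(X <= x) for a gamma random variable with the given shape and mean.

    Uses the series P(a, z) = z**a * exp(-z) * sum_k z**k / Gamma(a + k + 1),
    where z = x / scale and scale = mean / shape.
    """
    scale = mean / shape
    z = x / scale
    term = 1.0 / math.gamma(shape + 1.0)  # k = 0 term of the series
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (shape + k)  # ratio of consecutive series terms
        total += term
        k += 1
    return z ** shape * math.exp(-z) * total


# Illustrative cutoff for "close to 0" (an assumption, not from the paper).
CUTOFF = 0.1

# Shape parameters matching the test-environment curves in Figures B.2 and B.3.
for beta in (0.1, 0.3, 0.5, 1.0, 5.0):
    print(f"shape beta = {beta:>4}: P(environment <= {CUTOFF}) = "
          f"{gamma_cdf(CUTOFF, beta):.4f}")
```

Under these assumptions, the probability of drawing an environment value below the cutoff falls steadily as the shape parameter grows: well over half of the draws are below 0.1 when β = 0.1, and almost none are when β = 5. This is consistent with the explanation given above for why a test environment with β = 0.1 ends testing early.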


[Figure B.2. Probability of system survival after completing test until a run of 5
successes in a row; θ = 0.5, 0.5, 0.5, 0.5.
Test: gamma-distributed environment, mean = 1, shape parameter beta.
Field: gamma-distributed environment, mean = 1, shape parameter = 0.5.
Horizontal axis: initial number of defects in each of 4 stages.
Curves: test environment beta = 0.1, 0.3, 0.5, 1, 5.]

Variable test environments that allow a disproportionate number of excessively benign
environments, even though balanced by some that are excessively stringent, can thus
severely bias the quality of the fielded product. This is only common sense, but
quantified.

[Figure B.3. Mean number of tests needed to obtain 5 successes in a row.
Test: gamma-distributed environment, mean = 1, shape parameter beta.
Field: gamma-distributed environment, mean = 1, shape parameter = 0.5.
Vertical axis: mean number of tests.
Horizontal axis: number of defects in each of 4 stages.
Curves: test environment beta = 0.1, 0.3, 0.5, 1, 5.]